{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "%matplotlib inline"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# Comparison of F-test and mutual information\n\n\nThis example illustrates the differences between univariate F-test statistics\nand mutual information.\n\nWe consider 3 features x_1, x_2, x_3 distributed uniformly over [0, 1], the\ntarget depends on them as follows:\n\ny = x_1 + sin(6 * pi * x_2) + 0.1 * N(0, 1), that is the third features is\ncompletely irrelevant.\n\nThe code below plots the dependency of y against individual x_i and normalized\nvalues of univariate F-tests statistics and mutual information.\n\nAs F-test captures only linear dependency, it rates x_1 as the most\ndiscriminative feature. On the other hand, mutual information can capture any\nkind of dependency between variables and it rates x_2 as the most\ndiscriminative feature, which probably agrees better with our intuitive\nperception for this example. Both methods correctly marks x_3 as irrelevant.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(__doc__)\n\nimport numpy as np\nimport matplotlib.pyplot as plt\nfrom sklearn.feature_selection import f_regression, mutual_info_regression\n\nnp.random.seed(0)\nX = np.random.rand(1000, 3)\ny = X[:, 0] + np.sin(6 * np.pi * X[:, 1]) + 0.1 * np.random.randn(1000)\n\nf_test, _ = f_regression(X, y)\nf_test /= np.max(f_test)\n\nmi = mutual_info_regression(X, y)\nmi /= np.max(mi)\n\nplt.figure(figsize=(15, 5))\nfor i in range(3):\n    plt.subplot(1, 3, i + 1)\n    plt.scatter(X[:, i], y, edgecolor='black', s=20)\n    plt.xlabel(\"$x_{}$\".format(i + 1), fontsize=14)\n    if i == 0:\n        plt.ylabel(\"$y$\", fontsize=14)\n    plt.title(\"F-test={:.2f}, MI={:.2f}\".format(f_test[i], mi[i]),\n              fontsize=16)\nplt.show()"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.6.9"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}